• Premise of the study: Because plant identification demands extensive knowledge and complex terminology, even professional 
botanists require significant time in the field to master the subject. As plant leaves are normally regarded as possessing 
useful characteristics for species identification, leaf recognition through images can be considered an important research issue 
for plant recognition. 

• Methods: This study proposes a feature extraction method for leaf contours, which describes the lines between the centroid and 
each contour point on an image. A length histogram is created to represent the distribution of distances in the leaf contour. 
Thereafter, a classifier is applied from a statistical model to calculate the matching score of the template and query leaf. 

• Results: In the experiments, the correct species was ranked first for 92.7% of the test leaves and within the first two 
candidates for 97.3%. In the scale invariance test, the 45 correlation coefficients fell between a minimum of 0.98611 and a 
maximum of 0.99992. The rotation invariance test likewise used 45 comparison sets, and its correlation coefficients ranged 
between 0.98071 and 0.99988. 

• Discussion: This study shows that the features extracted from leaf images are invariant to scale and rotation, because the 
correlation coefficients between the feature histograms of scaled or rotated images are close to 1. Moreover, the experimental 
results indicate that the proposed method outperforms two other methods, Zernike moments and curvature scale space. 

Key words: classifier of statistical model; edge detection; feature extraction; leaf recognition. 



Because plant identification demands extensive knowledge 
and uses complex terminology, even professional botanists 
need to take much time in the field to master plant identification 
(Rademaker, 2000). Plant identification by information systems 
has often been regarded as a possibility. By employing personal 
digital devices to photograph the whole plant or a portion of the 
plant, information systems can be used to perform plant recog- 
nition. Plants may be recognized through the leaves, flowers, 
roots, and fruits, which reflect the diversity of plant shapes 
available within an organism. In particular, the shape of leaves 
and the floral organs — the modified leaves — are especially im- 
portant (Tsukaya, 2006), with the leaves considered an espe- 
cially useful characteristic for species identification (Gu et al., 
2005; Du et al., 2007; Wu et al., 2007). For example, the free 
mobile app Leafsnap (http://leafsnap.com) has been developed 
to identify tree species from photographs of their leaves. 
Marcysiak (2012) examined the morphology of Salix herbacea 
L. leaves for intraspecific morphological variation. A total of 
3890 leaves from 503 individuals were statistically analyzed 
based on leaf shape characters. A notable variation of shape 
characters of leaves of S. herbacea was found on different levels, 
including intra- and interindividual samples. In another example, 
Gailing et al. (2012) identified morphological species and differentiation 
patterns in two species, Quercus rubra L. and Q. ellipsoidalis E. J. 
Hill, which hybridize with each other. The two plant species 
were identified as two clusters when leaf morphological charac- 
ters were measured. Furthermore, two populations of Q. ellip- 
soidalis were differentiated from eight other populations 
through analysis of leaf morphological characters. Therefore, 
leaf recognition through images can be considered an important 
research issue for plant recognition. 

Shape is one of the most important features for describing an 
object. Humans can easily identify various objects and classify 
them into different categories solely from the outline of an ob- 
ject. Shape often carries several types of contour information, 
which are used as distinctive features for the classification of an 
object. In the MPEG-7 standard, shape descriptors can be di- 
vided into region-based shape descriptors and contour-based 
shape descriptors (Zhang and Lu, 2003a). Region-based shape 
descriptors such as Zernike moments (Wee and Paramesran, 
2007) describe a shape based on both boundary and interior 
pixel information. Region-based shape descriptors can be used 
to depict several complex objects with filled regions (Bober 
et al., 2002), and can capture both the interior contents and bound- 
ary information of an object in an image. However, contour- 
based descriptors only exploit the boundary information of an 
object, and include the conventional representation and struc- 
tural representation. Conventional descriptors such as curvature 
scale space (CSS) (Mokhtarian et al., 2005) retain the overall 
shape of an object during calculation. Structural descriptors 
such as chain code fragment the shape of an object into differ- 
ent boundary segments (Zhang and Lu, 2003b). 
Because the morphology of leaves is commonly used for 
plant identification, the studies listed in Table 1 have examined 
shape and morphological descriptions of plant leaves. 
As leaf recognition can be regarded as an image classification 
issue, various types of neural networks were proposed for iden- 
tifying the species to which a given leaf belongs. Chaki and 
Parekh (2011) presented a schematic for the automated detec- 
tion of three classes in a plant species by analyzing the shapes 
of leaves and using several neural network classifiers. Gao et al. 
(2010a) proposed a neural network classifier based on prior 
evolution and iterative approximation for leaf recognition. 
Huang and He (2008) applied probabilistic neural networks for 
the recognition of 30 types of broad-leaved trees. Furthermore, 
Wu et al. (2007) also introduced the probabilistic neural net- 
work to classify 32 types of plants. Other various classification 
methods were proposed for leaf recognition in addition to neu- 
ral networks. Ehsanirad (2010) trained a classifier to categorize 
13 types of plants with 65 new or deformed leaves during the 
testing process. In the Du et al. (2007) study, a moving median- 
centered hyper sphere classifier was adapted to perform the 
classification. Hajjdiab and Al Maskari (2011) presented an approach 
for identifying leaf images based on the cross-correlation 
of distances from the centroid to the leaf contour. 

Feature extraction for leaf images requires consideration of 
which features are most useful for representing the leaves and 
which methods can effectively code leaf morphologies (Wu 
et al., 2006). A leaf of a given species normally represents a 
specific shape or contour; therefore, this characteristic is a reli- 
able and meaningful indicator for leaf representation. The main 
contribution of this study is to propose a feature extraction 
method for leaf contours that describes these significant turning 
points. Moreover, a classifier of a statistical model is proposed 
for similarity matching with different numbers of features. 



MATERIALS AND METHODS 

Leaf recognition framework: The leaf recognition framework is divided into leaf modeling and leaf recognition. For leaf modeling, 
features are detected and extracted from leaves belonging to the same species, and the extracted features are used to create a 
leaf model for each species in the database. During leaf recognition, the same feature detection and extraction steps are applied 
to a query leaf. Using these features, the recognition system identifies the best-matching model and thereby recognizes the 
species of the query leaf. 

Object contour: The contour of object $O$ in image $I$ can be detected to generate the set $E$, which collects all contour points $p$ in a Cartesian coordinate system. These contour points can be used to calculate the centroid $C$ of the object using Equation 1. 

$$C = \frac{1}{|E|} \sum_{p \in E} p \tag{1}$$

where $|E|$ represents the number of edge points in set $E$. All contour points are collected in clockwise order and stored in set $E$. Because several segments of an object contour contain redundant points, these points can be removed through sampling: one contour point is selected from every five points in set $E$, and the selected points are stored in another set $S$. Figure 1 illustrates the process of detecting contour points. The contour points of the leaf in Fig. 1A are sampled to give Fig. 1B. 

Table 1. Methods and features used in leaf recognition studies. 

| Recognition method | Features | Reference |
|---|---|---|
| Neural network | Moment invariants; Centroid-Radii model | Chaki and Parekh, 2011 |
| Score of cross-correlation | Length of contour points to centroid | Hajjdiab and Al Maskari, 2011 |
| Classifier | Textural features of gray-level co-occurrence matrices | Ehsanirad, 2010 |
| Neural network | Standardized matrix; Angle of the leafstalk point; Angle of the tip point; Angle of the lowest point; Aspect ratio; Approximate circle factor; Differential angle of the petiole point; Differential angle of the tip point | Gao et al., 2010a |
| Distance of similar measure | Ratio of length and width; Ratio of the area of the upper part and the area of the lower part | Liao et al., 2010 |
| Probabilistic neural network | Aspect ratio; Rectangularity; Ratio of the square of perimeter and the area | Gao et al., 2010b |
| Probabilistic neural network | Label values of nervation types; Fractal dimension of vein image; Rectangularity; Circularity; Sphericity; Eccentricity; Axis ratio; Convexity area; Convexity perimeter | Huang and He, 2008 |
| Probabilistic neural network | Diameter; Physiological length; Physiological width; Leaf area; Leaf perimeter; Smooth factor; Aspect ratio; Form factor; Rectangularity; Narrow factor; Perimeter ratio of diameter; Perimeter ratio of physiological length and physiological width | Wu et al., 2007 |
| Move median centers hypersphere classifier | Aspect ratio; Rectangularity; Area ratio of convex hull; Perimeter ratio of convex hull; Sphericity; Circularity; Eccentricity; Form factor; Invariant moments | Du et al., 2007 |
| Neural network | Slimness; Roundness; Solidity; Moment invariants | Wu et al., 2006 |



Fig. 1. Detection of contour points. (A) Contour and centroid C of leaf. (B) Sampling result of contour points. 

Feature extraction: In the object contour, straight lines are created between centroid $C$ and each contour point $p$, and the lengths of these lines are then calculated. Suppose that the set of contour points is $S = \{p_1, p_2, \ldots, p_n\}$. The line length $len_i$ can be computed as 

$$len_i = \left|\overline{C p_i}\right|, \quad \forall\, p_i \in S \tag{2}$$

The distance features are normalized to create a histogram that represents the distribution of distances in the object contour. All $len_i$ are divided by the greatest length $len_{max}$ and collected in $R$ to normalize the length features. 

$$R = \{\, r_i \mid r_i = len_i / len_{max} \,\} \tag{3}$$
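The steps in Equations 1-3 can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation: it assumes the contour is already available as an ordered array of (x, y) points, and the function name `contour_distance_features` and the `step` parameter are hypothetical names introduced here.

```python
import numpy as np


def contour_distance_features(contour, step=5):
    """Normalized centroid-to-contour distances (Equations 1-3).

    contour: (n, 2) array of ordered contour points (x, y).
    step:    keep one point out of every `step` points (five in the text).
    """
    contour = np.asarray(contour, dtype=float)
    centroid = contour.mean(axis=0)      # Equation 1: mean of all contour points
    sampled = contour[::step]            # set S: sampled contour points
    lengths = np.linalg.norm(sampled - centroid, axis=1)  # Equation 2
    return lengths / lengths.max()       # Equation 3: values r_i in (0, 1]


# Usage: on a circle every contour point is equally far from the centroid,
# so all normalized lengths are (numerically) 1.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([3.0 * np.cos(theta), 3.0 * np.sin(theta)])
r = contour_distance_features(circle)
print(r.shape)
```

Dividing by the largest length is what removes the dependence on image size, since scaling the contour scales every length by the same factor.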



Intradifference, the difference among individual leaves of the same species, may cause mistaken recognition. To deal with the intradifference problem and make the classification stable, the proposed feature is processed through a fuzzy logic method. The degrees of probability from probabilistic logic (Lukasiewicz and Straccia, 2009) are introduced into the histogram, where the frequency of each bin is replaced by fuzzy scores. The fuzzy score algorithm in Appendix 1 transforms the normalized features into fuzzy scores. For example, the feature value of A is 4.25 and it is transformed into two fuzzy values [0.5, 0.5]. The two fuzzy values are accumulated into bins [3,4] and [4,5] in the histogram. For point B, three fuzzy values are [0,1,0] for bins [3,4], [4,5], and [5,6]. Two fuzzy values of point C are [0.3, 0.7] for bins [4,5] and [5,6]. Figure 2 shows how the three feature values are transformed into fuzzy values. Because $r_i \in [0, 1]$, the range of the normalized value is divided into $N$ classes, with $N = 24$ in this study. The histogram is stored in an array $v[\cdot]$, and each $r_i$ is assigned to its classes according to the following rules, where $j = \lfloor r_i \times N \rfloor$ (iBin in Appendix 1): 

$$
\begin{cases}
v[0] = v[0] + 1, & \text{if } 0 \le r_i < \frac{1}{2N} \\[4pt]
v[j-1] = v[j-1] + \left(\frac{2j+1}{2} - r_i \times N\right), \quad v[j] = v[j] + \left(r_i \times N - \frac{2j-1}{2}\right), & \text{if } \frac{1}{2N} \le r_i < \frac{2j+1}{2N} \\[4pt]
v[j] = v[j] + \left(\frac{2j+3}{2} - r_i \times N\right), \quad v[j+1] = v[j+1] + \left(r_i \times N - \frac{2j+1}{2}\right), & \text{if } \frac{2j+1}{2N} \le r_i \le 1 - \frac{1}{2N} \\[4pt]
v[N-1] = v[N-1] + 1, & \text{if } 1 - \frac{1}{2N} < r_i \le 1
\end{cases}
\tag{4}
$$

Fig. 2. Probabilistic logic diagram. 
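The fuzzy accumulation rule can be sketched in Python as follows. This is an illustrative reading of the fuzzy score algorithm in Appendix 1, not the authors' code: each normalized value contributes a total weight of 1, shared linearly between the bin it falls in and the nearest neighbouring bin. The function name `fuzzy_histogram` and the guard on the last bin index are assumptions added here.

```python
import math


def fuzzy_histogram(r_values, n_bins=24):
    """Accumulate normalized lengths r in [0, 1] into a fuzzy histogram."""
    v = [0.0] * n_bins
    half = 1.0 / (2 * n_bins)            # half a bin width
    for r in r_values:
        if r < half:                     # left edge: whole vote to the first bin
            v[0] += 1.0
        elif r > 1.0 - half:             # right edge: whole vote to the last bin
            v[n_bins - 1] += 1.0
        else:
            j = math.floor(r * n_bins)   # bin containing r (iBin in Appendix 1)
            mid = (j + 0.5) / n_bins     # centre of bin j
            if r < mid:                  # share the vote with the bin on the left
                w = (mid - r) * n_bins
                v[j - 1] += w
                v[j] += 1.0 - w
            else:                        # share the vote with the bin on the right
                w = (r - mid) * n_bins
                v[j] += 1.0 - w
                if w > 0.0 and j + 1 < n_bins:   # added guard against overflow
                    v[j + 1] += w
    return v


# Usage with 4 bins so the result is easy to check by hand:
# 0.5 lies halfway between the centres of bins 1 and 2, 0.625 is the centre
# of bin 2, and 1.0 falls in the right-edge case.
print(fuzzy_histogram([0.5, 0.625, 1.0], n_bins=4))  # [0.0, 0.5, 1.5, 1.0]
```

A value at a bin centre contributes only to that bin, while a value on a bin boundary is split evenly between the two adjacent bins, which is what softens the effect of small shape differences between leaves of the same species.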
Fig. 3. Thirteen species of plant leaves collected for this study, including sample leaves and feature histograms. 
Table 2. Recognition results of the proposed features for the training set and the test set. 

| Species | Training set | Testing set |
|---|---|---|
| Basella alba | (18, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (15, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Rosa rugosa | (18, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (19, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Gynura bicolor | (16, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (14, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Morus alba | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Coleus amboinicus | (19, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Nymphaea tetragona | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Salix argyracea | (16, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (19, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Capsicum annuum | (19, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (16, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Ipomoea batatas | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Ipomoea aquatica | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Eucalyptus globulus | (18, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Aglaia odorata | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Impatiens walleriana | (18, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (18, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Total | (242, 16, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (241, 12, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Total (%) | (93.1, 6.1, 0.8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (92.7, 4.6, 1.9, 0.8, 0, 0, 0, 0, 0, 0, 0, 0, 0) |



Each object yields a histogram that represents information regarding its contour. Therefore, the resulting histograms can be used to estimate the matching degree between any two objects. 

Classifier of statistical model: Once the leaf features $X = (x_1, x_2, \ldots, x_n)$ are extracted from a leaf, the leaf classifier can be expressed using the following equation: 

$$\hat{i} = \arg\max_{i} P(T_i \mid X) \tag{5}$$

where $T_i$ is the model of leaf species $i$ and $P(T_i \mid X)$ is the discriminant function of $T_i$. Bayes' theorem indicates 

$$P(T_i \mid X) = \frac{f(X \mid T_i) \times P(T_i)}{f(X)} \tag{6}$$

where $f(\cdot)$ is the probability density function. The term $f(X)$ is common to every model $T_i$, so it does not affect which $\hat{i}$ attains the maximum probability. If we assume a uniform prior probability $P(T_i)$ on the species identity, the discriminant function in Equation 5 can be simplified as 

$$\hat{i} = \arg\max_{i} f(X \mid T_i) \tag{7}$$

To reduce computational complexity, we further assume that $x_1, x_2, \ldots, x_n$ are mutually independent features. Equation 7 can then be transformed into Equation 8: 

$$\hat{i} = \arg\max_{i} \prod_{j=1}^{n} f(x_j \mid T_i) \tag{8}$$

If each feature $x_j$ is distributed normally with mean $\mu_{ij}$ and variance $\sigma_{ij}^2$ under model $T_i$, that is, $f(x_j \mid T_i) \sim N(\mu_{ij}, \sigma_{ij}^2)$, then 

$$f(X \mid T_i) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi\sigma_{ij}^2}} \exp\left(-\frac{(x_j - \mu_{ij})^2}{2\sigma_{ij}^2}\right) \tag{9}$$

To avoid computing the exponential directly, we use the logarithm of the discriminant function 

$$\mathrm{Log}\big(f(X \mid T_i)\big) = -\frac{1}{2} \sum_{j=1}^{n} \left[\frac{(x_j - \mu_{ij})^2}{\sigma_{ij}^2} + \mathrm{Log}\left(2\pi\sigma_{ij}^2\right)\right] \tag{10}$$

which is referred to as the score function. Thereafter, $c$ sample leaves of each species in the training set are used to estimate the parameters $\mu_{ij}$ and $\sigma_{ij}^2$ of each $T_i$ as follows: 



(11) 
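The classifier described above amounts to a Gaussian model with independent features per species. The following Python sketch illustrates the idea under stated assumptions: the parameters are estimated here as the per-feature sample mean and the population variance (divisor $c$) over the training leaves, which may differ from the exact estimator used in the study, and a small constant `eps` is added to the variance to avoid division by zero. The function names and the toy data are hypothetical.

```python
import numpy as np


def fit_species_models(train, eps=1e-6):
    """train: dict mapping species -> (c, n) array of feature histograms.

    Returns dict mapping species -> (mean, variance), each of length n.
    """
    models = {}
    for species, samples in train.items():
        samples = np.asarray(samples, dtype=float)
        mu = samples.mean(axis=0)          # assumed estimator: sample mean
        var = samples.var(axis=0) + eps    # assumed: population variance plus eps
        models[species] = (mu, var)
    return models


def log_score(x, mu, var):
    """Score function of Equation 10 for one species model."""
    x = np.asarray(x, dtype=float)
    return float(-0.5 * np.sum((x - mu) ** 2 / var + np.log(2.0 * np.pi * var)))


def rank_species(x, models):
    """Species ordered from highest to lowest score; the first is the argmax."""
    return sorted(models, key=lambda s: log_score(x, *models[s]), reverse=True)


# Toy example with two species and two features (not data from the study).
train = {
    "A": [[0.0, 1.0], [0.2, 0.8]],
    "B": [[1.0, 0.0], [0.8, 0.2]],
}
models = fit_species_models(train)
ranking = rank_species([0.1, 0.9], models)
print(ranking)  # ['A', 'B']
```

Returning the full ranking rather than only the best match makes it straightforward to count how often the correct species appears in the first, second, or later position.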



Table 3. Recognition results of Zernike moments for the training set and the test set. 

| Species | Training set | Testing set |
|---|---|---|
| Basella alba | (14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (9, 3, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0) |
| Rosa rugosa | (10, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (8, 4, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0) |
| Gynura bicolor | (10, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (5, 3, 1, 2, 3, 1, 0, 0, 0, 0, 0, 0, 0) |
| Morus alba | (14, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (10, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Coleus amboinicus | (13, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (11, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Nymphaea tetragona | (14, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0) | (14, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Salix argyracea | (6, 3, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (3, 8, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Capsicum annuum | (10, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (10, 2, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0) |
| Ipomoea batatas | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Ipomoea aquatica | (13, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (12, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Eucalyptus globulus | (11, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (11, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Aglaia odorata | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Impatiens walleriana | (14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Total | (159, 22, 11, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0) | (137, 33, 7, 6, 8, 1, 1, 1, 1, 0, 0, 0, 0) |
| Total (%) | (81.5, 11.3, 5.6, 1, 0.5, 0, 0, 0, 0, 0, 0, 0, 0) | (70.3, 16.9, 3.6, 3.1, 4.1, 0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0) |
Table 4. Recognition results of curvature scale space for the training set and the test set. 

| Species | Training set | Testing set |
|---|---|---|
| Basella alba | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Rosa rugosa | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Gynura bicolor | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (10, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Morus alba | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Coleus amboinicus | (14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Nymphaea tetragona | (13, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (13, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Salix argyracea | (13, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (9, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Capsicum annuum | (11, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (6, 7, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Ipomoea batatas | (14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (11, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Ipomoea aquatica | (14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (13, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Eucalyptus globulus | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Aglaia odorata | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Impatiens walleriana | (15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (12, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Total | (184, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (163, 26, 5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) |
| Total (%) | (94.4, 5.6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | (83.6, 13.3, 2.6, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0) |



(12) 



where $x_{k}^{(i,j)}$ represents the $k$-th feature of the $j$-th sample leaf in the $i$-th species. 

(13) 

RESULTS AND DISCUSSION 

This study examined 13 species of fresh plant leaves, as shown in Fig. 3, which also includes sample leaves and the feature histogram of a given leaf. For each species, separate images of 40 plant leaves were used to evaluate the proposed features and algorithms. The first 20 images of each species form the training set and the last 20 images form the test set. A feature histogram $v[\cdot]$ was created for every leaf. Equation 11 and Equation 12 are applied to compute $\mu_{ij}$ and $\sigma_{ij}^2$ for each plant species. Moreover, the mean and variance of the centroid are computed for each leaf. 

(14) 



Fig. 4. Two leaf contour images and their corresponding feature histograms. Although the two leaves belong to the same species, their histograms present two greatly different feature curves. 

Fig. 5. A binary leaf image presented at sizes from 90% to 10% of the original size to verify scale invariance, with their corresponding histograms. In these histograms, the horizontal axis and vertical axis represent feature number and feature value, respectively. 

Table 2 reports the recognition results for the training set and test set as a tredecuple (13-element) ordered list. The first position gives the number of leaves for which the correct species was ranked first, the second position gives the number of leaves for which the correct species was ranked second, and so on. Ideally, the correct species should be ranked as high as possible. For the training set, the first position of the tredecuple reaches 93.1%, and the first two positions together reach 99.2%. For the test set, the first position reaches 92.7%, and the first two positions together reach 97.3%. The recognition performance on the training set and the test set is therefore close. 
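The first-position and first-two-position rates quoted above follow directly from the "Total" rows of Table 2. The short Python check below is only an arithmetic illustration; the function name `cumulative_rank_rates` is a hypothetical helper introduced here.

```python
def cumulative_rank_rates(rank_counts):
    """Percentage of leaves whose correct species appears within the first k
    positions, for k = 1, 2, ..., len(rank_counts)."""
    total = sum(rank_counts)
    rates, running = [], 0
    for count in rank_counts:
        running += count
        rates.append(round(100.0 * running / total, 1))
    return rates


# "Total" rows of Table 2 (13 species x 20 leaves = 260 leaves per set).
train_totals = [242, 16, 2] + [0] * 10
test_totals = [241, 12, 5, 2] + [0] * 9

train_rates = cumulative_rank_rates(train_totals)
test_rates = cumulative_rank_rates(test_totals)
print(train_rates[:2])  # [93.1, 99.2]
print(test_rates[:2])   # [92.7, 97.3]
```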

Zernike moments and curvature scale space are two popular methods that are both invariant to scale and rotation, and they were tested in the same experimental setup. Zernike moments are derived from a set of complex polynomials that are orthogonal over the interior of the unit circle and defined in polar coordinates. The recognition results of the two methods for the training set and test set are shown in Table 3 and Table 4. Comparing the recognition rate for the first probable plant species on the test set, the results in Tables 2-4 indicate that the proposed method (92.7%) outperforms Zernike moments (70.3%) and curvature scale space (83.6%). 

Leaves belonging to the same species may still differ greatly in contour. For example, Fig. 4 shows two leaf contours and their corresponding feature histograms. Although the two leaves belong to the same species, their histograms present two greatly different feature curves. An erroneous recognition occurs when the feature curve of a given leaf is closer to the model of another species than to that of the correct species. This problem could be addressed by building multiple models for the same species, which is a potential issue for future research. 

Fig. 6. A binary leaf image rotated clockwise from 10° to 90° to verify rotation invariance, with corresponding histograms. In these histograms, the horizontal axis and vertical axis represent feature number and feature value, respectively. 



The experimental results indicate that the correct recognition rate is 92.7% if only the first-position species of the recognition result is considered; in other words, the erroneous recognition rate is approximately 7.3%. One possible cause of erroneous recognition is the choice of the parameter N in Equation 4, which affects the fuzzy feature during feature extraction. When N is set higher, leaves belonging to the same species may be regarded as different species. When N is set lower, leaves belonging to different species may be regarded as the same species. A similar parameter determination issue arises for the length of the interval used for sampling contour points. 

Scale invariance: To verify scale invariance, a binary image was shrunk to various sizes relative to the original image (from 90% to 10%). The features of the different-sized images were extracted to create their corresponding histograms, as shown in Fig. 5. Correlation coefficients were computed to measure the similarity between every pair of scale ratios; with 10 images (the original and nine reduced sizes), this gives 45 comparison sets. These 45 correlation coefficients fell between the minimal value 0.98611 and the maximal value 0.99992, indicating a strongly positive correlation. The results indicate that the 10 feature histograms are very similar in terms of correlation coefficients. The curve in the feature histogram does not fluctuate considerably even when the image is shrunk to 10% of the original scale. These results confirm that the proposed features are invariant to scale. 
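The comparison protocol can be illustrated with a short Python sketch that computes the Pearson correlation coefficient for every unordered pair of histograms. The histograms below are synthetic stand-ins generated from random numbers, not data from the study, and `pairwise_correlations` is a hypothetical helper name.

```python
from itertools import combinations

import numpy as np


def pairwise_correlations(histograms):
    """Pearson correlation coefficient for every unordered pair of histograms."""
    return [float(np.corrcoef(a, b)[0, 1]) for a, b in combinations(histograms, 2)]


# Synthetic stand-ins for 10 histograms of 24 bins each: one base curve plus
# a small perturbation per image.
rng = np.random.default_rng(0)
base = rng.random(24)
histograms = [base + 0.01 * rng.standard_normal(24) for _ in range(10)]

coeffs = pairwise_correlations(histograms)
print(len(coeffs))                # 45 pairs from 10 histograms
print(min(coeffs), max(coeffs))   # range of the coefficients
```

Reporting the minimum and maximum over all pairs, as the text does, gives a conservative summary because a single dissimilar pair would lower the minimum.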

Rotation invariance: To verify rotation invariance, a binary image was rotated clockwise by various angles from its original orientation (from 10° to 90°). The features of the rotated images were extracted to create their corresponding histograms, as shown in Fig. 6. Like the scale invariance test, the rotation invariance test was performed for 45 comparison sets using correlation analysis. The correlation coefficients ranged between 0.98071 and 0.99988. These results indicate that the curves of these histograms are very similar in appearance, which confirms the rotation invariance of the proposed features. 
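One way to see why the features should be rotation invariant is that rotating a contour leaves every centroid-to-point distance unchanged. The Python sketch below checks this on a synthetic ellipse; it is a simplified illustration that uses a plain (non-fuzzy) 24-bin histogram and omits contour sampling, and the helper names `length_histogram` and `rotate` are hypothetical. In real images, resampling on the pixel grid after rotation introduces small differences, which is consistent with measured coefficients slightly below 1.

```python
import numpy as np


def length_histogram(contour, n_bins=24):
    """Plain histogram of normalized centroid-to-contour distances."""
    contour = np.asarray(contour, dtype=float)
    lengths = np.linalg.norm(contour - contour.mean(axis=0), axis=1)
    counts, _ = np.histogram(lengths / lengths.max(), bins=n_bins, range=(0.0, 1.0))
    return counts


def rotate(contour, degrees):
    """Rotate points about the origin; positive angles are counterclockwise,
    so pass a negative angle for a clockwise rotation."""
    t = np.deg2rad(degrees)
    rot = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    return np.asarray(contour, dtype=float) @ rot.T


# Synthetic ellipse contour standing in for a leaf outline.
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
ellipse = np.column_stack([2.0 * np.cos(theta), 1.3 * np.sin(theta)])

original = length_histogram(ellipse)
rotated = length_histogram(rotate(ellipse, -30))
corr = float(np.corrcoef(original, rotated)[0, 1])
print(corr)
```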



CONCLUSIONS 

This study presents a feature extraction method for shape de- 
scription and a classifier of a statistical model for different fea- 
ture dimensions. The extracted features are invariant to scale 
and rotation, and the proposed method outperforms Zernike 
moments and curvature scale space. If the shape of leaves within 
a species varies substantially, multiple leaf templates are sug- 
gested for creating the species leaf model. We will extract more 
features from the patterns of the leaf vein and positions of the 
petioles of leaves in a future study to improve recognition 
performance. 
Appendix 1. The fuzzy score algorithm. 

Begin
    Create an N-dimensional array v[•]
    Move the cursor to the first r_i in R
    Run the following step until the cursor moves to the last one.
    Begin
        iBin = Floor(r_i × N)
        If r_i < 1/(2N) Then
            v[0] = v[0] + 1
        ElseIf r_i > 1 - 1/(2N) Then
            v[N - 1] = v[N - 1] + 1
        Else
        Begin
            Mid = (iBin + 0.5)/N
            If r_i < Mid Then
            Begin
                v[iBin - 1] = v[iBin - 1] + (Mid - r_i) × N
                v[iBin] = v[iBin] + (r_i - Mid + 1/N) × N
            End
            Else
            Begin
                v[iBin] = v[iBin] + (Mid - r_i + 1/N) × N
                v[iBin + 1] = v[iBin + 1] + (r_i - Mid) × N
            End
        End
        Move the cursor to the next r_i
    End
End